ABSTRACT

Suppose $\{X_n\}$ is a sequence of random variables (r.v.s) with means $\mu_n$ and variances $f_n(\mu_n)$. Attempts to find r.v.s $T_n = T_n(X_n)$ with approximately constant variance have focussed on r.v.s of the form

(1)  $T_n = \int^{X_n} \frac{dx}{\sqrt{f_n(x)}}$ .

Curtiss (Annals of Math. Stat., vol. 14, 107-122) proved a fundamental theorem which gave a sound theoretical basis for transformations of the form (1). This note gives several generalizations and applications of Curtiss's theorem.

Typical is the following:

Suppose the sequence of r.v.s $\{X_n - \mu_n\}$ converges in probability to 0, the sequence of real numbers $\{\mu_n\}$ converges to a finite limit $\mu$, and the distribution functions of $\{(X_n - \mu_n)\alpha_n\}$ ($\alpha_n$ a positive constant) converge to a d.f. $F(\omega)$. If the real-valued function of a real variable $\xi(x)$ has a continuous derivative $\xi'(x)$ which does not vanish at $x = \mu$, then the d.f.'s of the sequence

$\left\{ \frac{(\xi(X_n) - \xi(\mu_n))\,\alpha_n}{\xi'(\mu_n)} \right\}$

converge to $F(\omega)$.

Standard theorems of real function theory and standard techniques of probability theory are employed.
1. Introduction. Suppose $\{X_n\}$ is a sequence of random variables (r.v.s) with means $\mu_n$ and variances $f_n(\mu_n)$. Attempts to find r.v.s $T_n = T_n(X_n)$ with approximately constant variance have focussed on r.v.s of the form

(1)  $T_n = \int^{X_n} \frac{dx}{\sqrt{f_n(x)}}$ .

The heuristic argument usually advanced for such a transformation involves approximating $T_n(x)$ by the linear term of its Taylor series expansion in a neighborhood of $\mu_n$. Of course, heuristic arguments are a matter of taste, but many people have pointed out difficulties connected with this one. An early reference is [3]; a recent one is [5], p. 72.

In 1943, Curtiss [3] proved a fundamental theorem which gave a sound theoretical basis to transformations of the form (1).

This note gives an alternate heuristic argument which can be made rigorous. In its rigorous form it is a generalization of Curtiss's theorem and implies many of the standard asymptotic theorems. Standard techniques of real function theory and probability theory are used.
2. An Heuristic Argument. We follow Curtiss in his formulation. We consider a sequence of r.v.s $\{X_n\}$, a sequence of real numbers $\{\mu_n\}$, and a sequence of real-valued integrable functions $\{\varphi_n(x)\}$ defined on the real numbers, such that the sequence of r.v.s $Y_n = (X_n - \mu_n)\varphi_n(\mu_n)$ have distribution functions (d.f.s) which converge to a d.f. $F(\omega)$ at continuity points of $F$. (Let $Y$ be a r.v. with d.f. $F$. We shall follow Parzen [7], p. 424, saying that the law of $Y_n$ converges to the law of $Y$ and writing $\mathcal{L}(Y_n) \to \mathcal{L}(Y)$.)

We consider the sequence of r.v.s $T_n = \int_{\mu_n}^{X_n} \varphi_n(x)\,dx$. If $\varphi_n$ is continuous, then, according to the mean value theorem for integrals, $T_n = (X_n - \mu_n)\varphi_n(\zeta_n)$ for suitable $\zeta_n$. Now if $\varphi_n$ is a "slowly-changing" function, $T_n$ will be approximately $(X_n - \mu_n)\varphi_n(\mu_n) = Y_n$ and we should have $\mathcal{L}(T_n) \to \mathcal{L}(Y)$.

(Arley and Buch [1, p. 79], in considering the problem of data transformation, have applied the adjective "slowly-varying" to the function $T_n(x)$, interpreting this as a condition relating the first two derivatives of $T_n$, which they assume exist. In making the above argument rigorous, we shall use "slowly-varying" in the precise sense of Curtiss as stated in Theorem 1, below.)
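
The heuristic can be checked numerically in a simple case. The sketch below is an illustration only and not part of the original argument: it assumes $\mu_n = n$ and the integrand $\varphi_n(x) = x^{-1/2}$, so that $T_n = 2(\sqrt{X_n} - \sqrt{n})$ and $Y_n = (X_n - n)/\sqrt{n}$. The function names are hypothetical. It evaluates $T_n - Y_n$ at the point where $Y_n$ equals a fixed value $y$.

```python
import math

def t_n(x, n):
    """T_n = integral from n to x of t**(-1/2) dt = 2*(sqrt(x) - sqrt(n))."""
    return 2.0 * (math.sqrt(x) - math.sqrt(n))

def y_n(x, n):
    """Y_n = (x - mu_n) * phi_n(mu_n), assuming mu_n = n and phi_n(n) = 1/sqrt(n)."""
    return (x - n) / math.sqrt(n)

def gap(y, n):
    """T_n - Y_n evaluated at X_n = n + y*sqrt(n), i.e. where Y_n = y."""
    x = n + y * math.sqrt(n)
    return t_n(x, n) - y_n(x, n)

# Absolute gap at y = 2 for increasing n.
gaps = [abs(gap(2.0, n)) for n in (10**2, 10**4, 10**6)]
print(gaps)
```

For a fixed $y$ the gap in this example shrinks roughly like $y^2/(4\sqrt{n})$, which is the sense in which a slowly-changing integrand leaves $Y_n$ nearly unchanged.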
3. The Heuristic Argument Made Rigorous. We use this line of argument to prove Theorem 1. We consider throughout this paper a sequence of random variables $\{X_n\}$, a sequence of real numbers $\{\mu_n\}$, and a sequence of real-valued functions of a real variable $\{\varphi_n(x)\}$, Lebesgue integrable with respect to $x$ for each finite interval and each $n$. We shall be concerned with the following conditions.

Condition A: The laws of the r.v.s $Y_n = (X_n - \mu_n)\varphi_n(\mu_n)$ converge to the law of a r.v. $Y$.

Condition B: $\varphi_n(\mu_n) > 0$ .

Condition C: For an arbitrary closed, bounded interval $[a,b]$,

$\lim_{n \to \infty} \frac{\varphi_n\!\left(\frac{x}{\varphi_n(\mu_n)} + \mu_n\right)}{\varphi_n(\mu_n)} = 1$

uniformly for $x \in [a,b]$.

Theorem 1. (Curtiss) Consider the r.v.

(2)  $T_n = \int_{\mu_n}^{X_n} \varphi_n(x)\,dx$ .

If conditions A, B, and C hold, the laws of $T_n$ converge to the law of $Y$.

(In the argument below, we only need condition C holding for almost all $x$. One merely replaces "infimum" and "supremum" by "essential infimum" and "essential supremum".)

(In [3], Curtiss hypothesized and used the continuity of the d.f. $F(\omega)$ of $Y$.)
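
Condition C can be probed numerically for a candidate integrand. The following is a minimal sketch under stated assumptions: the helper `condition_c_gap` is hypothetical, the supremum over $[a,b]$ is only approximated on a finite grid, and the choice $\varphi_n(x) = x^{-1/2}$ with $\mu_n = n$ is an illustrative example rather than one prescribed by the text.

```python
import math

def condition_c_gap(phi, mu, a, b, points=201):
    """Largest |phi(x/phi(mu) + mu)/phi(mu) - 1| over an evenly spaced grid on [a, b]."""
    scale = phi(mu)
    worst = 0.0
    for i in range(points):
        x = a + (b - a) * i / (points - 1)
        worst = max(worst, abs(phi(x / scale + mu) / scale - 1.0))
    return worst

def phi(x):
    # Illustrative choice (an assumption): phi_n(x) = x**(-1/2) for every n.
    return 1.0 / math.sqrt(x)

# Grid approximation of the uniform gap on [-3, 3] for mu_n = n.
gaps = {n: condition_c_gap(phi, float(n), -3.0, 3.0) for n in (100, 10_000, 1_000_000)}
print(gaps)
```

A gap that decreases toward 0 as $n$ grows is what Condition C asks for on this interval; a grid check of this kind is only suggestive and does not replace the uniform-convergence argument.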

Proof of Theorem 1: That $T_n$ is a r.v. is assured by the measurability of a continuous function of a measurable function.

Since $\mathcal{L}(Y_n) \to \mathcal{L}(Y)$, there exist continuity points $a$ and $b$ of $F$ and $n_0$ such that

(3)  $P(a \le Y_n \le b) > 1 - \varepsilon$

for all $n > n_0$. Consider $n > n_0$. From elementary properties of conditional probability

(4)  $P(T_n \le \omega \mid a \le Y_n \le b)(1 - \varepsilon) \le P(T_n \le \omega) \le P(T_n \le \omega \mid a \le Y_n \le b) + \varepsilon$ .

(Here $P(\# \mid *)$ is the conditional probability of the event # given that the event * holds.)

Now the condition $a \le Y_n \le b$ is the same as the condition

(5)  $\frac{a}{\varphi_n(\mu_n)} + \mu_n \le X_n \le \frac{b}{\varphi_n(\mu_n)} + \mu_n$ .

Also

(6)  $m(d - c) \le \int_c^d f(x)\,dx \le M(d - c)$

if $f(x)$ satisfies $m \le f(x) \le M$ on the interval $[c,d]$. Thus, for $Y_n \in [a,b]$, so long as $X_n \ge \mu_n$ we may write

(7)  $(X_n - \mu_n)\,m_n \le T_n \le (X_n - \mu_n)\,M_n$

where

(8)  $m_n = \inf_{x \in [a,b]} \varphi_n\!\left(\frac{x}{\varphi_n(\mu_n)} + \mu_n\right)$

and

(9)  $M_n = \sup_{x \in [a,b]} \varphi_n\!\left(\frac{x}{\varphi_n(\mu_n)} + \mu_n\right)$ .

If $X_n < \mu_n$, $M_n$ and $m_n$ are interchanged in equation (7). Consequently, for $\omega \in [a,b]$,

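(10)  $P\big((X_n - \mu_n)M_n \le \omega \text{ and } X_n - \mu_n \ge 0 \mid a \le Y_n \le b\big) + P\big((X_n - \mu_n)m_n \le \omega \text{ and } X_n - \mu_n < 0 \mid a \le Y_n \le b\big)$

$\le P(T_n \le \omega \mid a \le Y_n \le b) \le$

$P\big((X_n - \mu_n)m_n \le \omega \text{ and } X_n - \mu_n \ge 0 \mid a \le Y_n \le b\big) + P\big((X_n - \mu_n)M_n \le \omega \text{ and } X_n - \mu_n < 0 \mid a \le Y_n \le b\big)$ .

But (assuming $a < 0 < b$, and writing $F_n$ for the d.f. of $Y_n$) this reduces to

(11a)  $\frac{F_n\!\left(\omega\,\varphi_n(\mu_n)/M_n\right) - F_n(a)}{F_n(b) - F_n(a)} \le P(T_n \le \omega \mid a \le Y_n \le b) \le \frac{F_n\!\left(\omega\,\varphi_n(\mu_n)/m_n\right) - F_n(a)}{F_n(b) - F_n(a)}$

when $\omega$ is positive and to

(11b)  $\frac{F_n\!\left(\omega\,\varphi_n(\mu_n)/m_n\right) - F_n(a)}{F_n(b) - F_n(a)} \le P(T_n \le \omega \mid a \le Y_n \le b) \le \frac{F_n\!\left(\omega\,\varphi_n(\mu_n)/M_n\right) - F_n(a)}{F_n(b) - F_n(a)}$

when $\omega$ is non-positive.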

But as an immediate consequence of Condition C we have

(12)  $\lim_{n \to \infty} \frac{m_n}{\varphi_n(\mu_n)} = 1$

and

(13)  $\lim_{n \to \infty} \frac{M_n}{\varphi_n(\mu_n)} = 1$ .

Thus if $\omega$ is a continuity point of $F$, the first and last terms in (11) can be made arbitrarily close to $F(\omega)$ and the middle term can be made (according to (4)) arbitrarily close to $P(T_n \le \omega)$. Thus $\mathcal{L}(T_n) \to \mathcal{L}(Y)$ as asserted.
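
The sandwich used in the proof can be verified on a concrete case. The sketch below assumes, purely for illustration, $\mu_n = n$ and $\varphi_n(x) = x^{-1/2}$, for which the integral defining $T_n$ has the closed form $2(\sqrt{X_n} - \sqrt{n})$. The helpers `bounds` and `sandwich_holds` are hypothetical names, and the infimum and supremum are approximated on a grid that includes both endpoints.

```python
import math

def phi(x):
    # Illustrative integrand (an assumption, not taken from the text).
    return 1.0 / math.sqrt(x)

def bounds(n, a, b, points=401):
    """Grid approximations to m_n (inf) and M_n (sup) of phi over the X-range for Y_n in [a, b]."""
    scale = phi(n)
    values = [phi((a + (b - a) * i / (points - 1)) / scale + n) for i in range(points)]
    return min(values), max(values)

def sandwich_holds(n, a, b, y):
    """Check (X_n - n)*m_n <= T_n <= (X_n - n)*M_n, with m_n, M_n swapped when X_n < n."""
    m, M = bounds(n, a, b)
    x = n + y / phi(n)                          # the value of X_n for which Y_n = y
    t = 2.0 * (math.sqrt(x) - math.sqrt(n))     # closed form of the integral of phi from n to x
    lo, hi = sorted(((x - n) * m, (x - n) * M))
    return lo - 1e-12 <= t <= hi + 1e-12

m, M = bounds(1_000_000, -3.0, 3.0)
ratios = (m / phi(1_000_000), M / phi(1_000_000))   # both should be close to 1
print(ratios, sandwich_holds(10_000, -3.0, 3.0, 3.0))
```

Sorting the two products handles the case $X_n < \mu_n$, where the roles of $m_n$ and $M_n$ are interchanged.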

Actually we can do better. The heuristic argument suggests that the difference $T_n - Y_n$ should be small in some sense. In fact, we can prove

Theorem 2. Under the hypotheses of Theorem 1, the sequence $\{T_n - Y_n\}$ converges to 0 with probability 1.

Proof: We let $\omega$ represent a sample point so that $Z(\omega)$ represents the value of the r.v. $Z$ at the point $\omega$. With the restrictions preceding (7) we have, as a consequence of (7),

(14)  $(X_n(\omega) - \mu_n)(m_n - \varphi_n(\mu_n)) \le T_n(\omega) - Y_n(\omega) \le (X_n(\omega) - \mu_n)(M_n - \varphi_n(\mu_n))$ .

(If $X_n(\omega) < \mu_n$ the inequalities are reversed.) In (3) we can always choose $a$ negative and $b$ positive, and we note from (5) that, under the conditions assumed,

(15)  $\frac{a}{\varphi_n(\mu_n)} \le X_n(\omega) - \mu_n \le \frac{b}{\varphi_n(\mu_n)}$ .

From (14) and (15) we find that $|T_n(\omega) - Y_n(\omega)|$ is no larger than the largest of the four numbers

$\frac{|a|\,|m_n - \varphi_n(\mu_n)|}{\varphi_n(\mu_n)}, \quad \frac{|a|\,|M_n - \varphi_n(\mu_n)|}{\varphi_n(\mu_n)}, \quad \frac{|b|\,|m_n - \varphi_n(\mu_n)|}{\varphi_n(\mu_n)}, \quad \frac{|b|\,|M_n - \varphi_n(\mu_n)|}{\varphi_n(\mu_n)}$ .

But each of these can be made arbitrarily close to 0 by choosing $n$ sufficiently large. Thus for $Y_n(\omega) \in [a,b]$, $\lim_{n \to \infty} (Y_n(\omega) - T_n(\omega)) = 0$. Thus $P((Y_n - T_n) \to 0) \ge 1 - \varepsilon$ for each positive $\varepsilon$ and the theorem is proved.
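
The bound on $|T_n - Y_n|$ obtained in the proof can be computed explicitly in an example. This sketch again assumes $\mu_n = n$ and $\varphi_n(x) = x^{-1/2}$ (an illustrative choice, not the author's); because this integrand is decreasing, its extremes over the relevant range sit at the endpoints. The names `four_number_bound` and `difference` are hypothetical.

```python
import math

def phi(x):
    # Illustrative integrand (an assumption, not taken from the text).
    return 1.0 / math.sqrt(x)

def four_number_bound(n, a, b):
    """Largest of |c| * |e - phi(n)| / phi(n) for c in {a, b} and e in {m_n, M_n}."""
    scale = phi(n)
    # phi is decreasing, so the inf and sup over [a, b] are attained at the endpoints.
    m, M = phi(b / scale + n), phi(a / scale + n)
    return max(abs(c) * abs(e - scale) / scale for c in (a, b) for e in (m, M))

def difference(n, y):
    """T_n - Y_n at the sample value for which Y_n = y."""
    x = n + y / phi(n)
    return 2.0 * (math.sqrt(x) - math.sqrt(n)) - y

for n in (100, 10_000, 1_000_000):
    print(n, four_number_bound(n, -3.0, 3.0), abs(difference(n, -3.0)))
```

In this example both the bound and the actual difference decrease as $n$ grows, with the difference staying below the bound for $Y_n$ in $[-3, 3]$.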
4. Large Neighborhoods and Small. Theorems 1 and 2 require that the functions $\varphi_n$ be nearly constant over arbitrary closed, bounded intervals. They have as corollaries many of the theorems requiring approximate linearity of a related function in a small neighborhood. We prove several such theorems.

Theorem 3. Suppose the sequence of r.v.s $\{X_n - \mu\}$ converges in probability to 0 and $\{(X_n - \mu)\alpha_n\}$ with $\alpha_n > 0$ for each $n$ converges in law to a r.v. $Y$. Suppose $\xi(x)$, a real-valued function defined on the reals, has a continuous first derivative which does not vanish at $x = \mu$. Then the sequence

$\left\{ \frac{(\xi(X_n) - \xi(\mu))\,\alpha_n}{\xi'(\mu)} \right\}$

converges in law to $Y$. (Here and subsequently $\xi'(x)$ is the derivative of $\xi(x)$.)

Proof: We suppose the law of $Y$ does not assign measure 1 to a single point for otherwise the theorem is trivial. Since

(16)  $\frac{\alpha_n(\xi(X_n) - \xi(\mu))}{\xi'(\mu)} = \int_{\mu}^{X_n} \frac{\alpha_n\,\xi'(x)}{\xi'(\mu)}\,dx$

we need only check that conditions A, B and C are satisfied for the function

$\varphi_n(x) = \frac{\alpha_n\,\xi'(x)}{\xi'(\mu)}$ .

Since $\varphi_n(\mu) = \alpha_n$, conditions A and B are satisfied by hypothesis. To check condition C we observe that the sequence $\{\alpha_n\}$ diverges to infinity since $\{X_n - \mu\}$ converges in probability to 0 and $\{(X_n - \mu)\alpha_n\}$ converges in law to a non-degenerate law. Condition C reduces to

(17)  $\lim_{n \to \infty} \frac{\xi'\!\left(\frac{x}{\alpha_n} + \mu\right)}{\xi'(\mu)} = 1$

uniformly for $x$ in $[a,b]$. But since $\alpha_n$ diverges to infinity, this is merely the statement that $\xi'(x)$ is continuous at $x = \mu$, since for $n$ sufficiently large and all $x \in [a,b]$, $\frac{x}{\alpha_n} + \mu$ is in an arbitrary neighborhood of $\mu$.

We only need the continuity of $\xi'(x)$ at $\mu$.
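
A small seeded simulation gives a rough feel for Theorem 3. The setup is an assumption made for illustration and is not taken from the text: $X_n$ is the mean of $n$ independent exponential r.v.s with mean 1, so that $\mu = 1$ and $\alpha_n = \sqrt{n}$, and $\xi(x) = \log x$ with $\xi'(1) = 1$. Under the theorem the transformed values $\sqrt{n}\,\log X_n$ should look approximately standard normal; the function name is hypothetical.

```python
import math
import random
import statistics

def transformed_sample(n, reps, seed=0):
    """Draw reps copies of sqrt(n) * (log(Xbar_n) - log(1)) / xi'(1) with xi = log."""
    rng = random.Random(seed)
    out = []
    for _ in range(reps):
        xbar = sum(rng.expovariate(1.0) for _ in range(n)) / n
        out.append(math.sqrt(n) * math.log(xbar))   # xi'(1) = 1 and log(1) = 0
    return out

sample = transformed_sample(n=400, reps=2000)
mean = statistics.mean(sample)
var = statistics.pvariance(sample)
print(mean, var)
```

A sample mean near 0 and a sample variance near 1 are consistent with the stated limit, although a simulation of this size can only suggest, not establish, convergence in law.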

Theorem 3 has as a special case the one-dimensional case of 5e.1 of Rao [8]. The $\mu$ of Theorem 3 can be replaced by a sequence $\{\mu_n\}$ which converges to a finite limit $\mu$. For we need only to verify that

(18)  $\lim_{n \to \infty} \frac{\xi'\!\left(\frac{x}{\alpha_n} + \mu_n\right)}{\xi'(\mu_n)} = 1$

uniformly for $x \in [a,b]$. Again this is just the continuity of $\xi'(x)$ at $\mu$. Hence we have

Theorem 4. Suppose the sequence of r.v.s $\{X_n - \mu_n\}$ converges in probability to 0, the sequence of real numbers $\{\mu_n\}$ converges to a finite limit $\mu$, and $\{(X_n - \mu_n)\alpha_n\}$ with $\alpha_n$ positive converges in law to a r.v. $Y$.




If $\xi(x)$ has a continuous first derivative which does not vanish at $x = \mu$, then the sequence

$\left\{ \frac{(\xi(X_n) - \xi(\mu_n))\,\alpha_n}{\xi'(\mu_n)} \right\}$

converges in law to $Y$.

Theorem 4 has as a special case the asymptotic normality of "smooth" functions of maximum likelihood estimates when those estimates satisfy the usual regularity conditions guaranteeing asymptotic unbiasedness, consistency and normality. Such a theorem is 12.3.7 of Wilks's book [9].

The question arises as to whether one can use a sequence of functions $\xi_n(x)$ in Theorem 3. Additional assumptions are necessary.

Theorem 5. Suppose the sequence of r.v.s $\{X_n - \mu\}$ converges in probability to 0 and $\{(X_n - \mu)\alpha_n\}$ converges in law to $Y$. Suppose $\{\xi_n(x)\}$ is a sequence of differentiable functions such that

(19)  $\lim_{n \to \infty} \xi_n'(x) = \eta(x)$

for each $x$, with $\eta(\mu) \ne 0$ and such that the functions $\xi_n'(x)$ are equicontinuous. Then the sequence

$\left\{ \frac{(\xi_n(X_n) - \xi_n(\mu))\,\alpha_n}{\eta(\mu)} \right\}$

converges in law to $Y$.
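Proof: We consider

$\varphi_n(x) = \frac{\alpha_n\,\xi_n'(x)}{\eta(\mu)}$ .

Now $\varphi_n(\mu) = \frac{\alpha_n\,\xi_n'(\mu)}{\eta(\mu)}$, so that Condition A is satisfied since

$\lim_{n \to \infty} \frac{\xi_n'(\mu)}{\eta(\mu)} = 1$

and Condition B is satisfied by restricting attention to $n$ large enough that

$\frac{\xi_n'(\mu)}{\eta(\mu)} > 0$ .

To check Condition C, we need to show that

(20)  $\lim_{n \to \infty} \frac{\xi_n'\!\left(\frac{x}{\varphi_n(\mu)} + \mu\right)}{\xi_n'(\mu)} = 1$

uniformly for $x \in [a,b]$. We show that the numerator can be made arbitrarily close to the non-zero constant $\eta(\mu)$ and that will suffice. We have

(21)  $\left|\xi_n'\!\left(\frac{x}{\varphi_n(\mu)} + \mu\right) - \eta(\mu)\right| \le \left|\xi_n'\!\left(\frac{x}{\varphi_n(\mu)} + \mu\right) - \xi_n'(\mu)\right| + \left|\xi_n'(\mu) - \eta(\mu)\right|$ .

The first term on the right can be made small by the equicontinuity of $\{\xi_n'\}$ at $x = \mu$ and the second can be made small by the convergence of $\xi_n'(\mu)$ to $\eta(\mu)$.

Since we have proved conditions A, B and C we could invoke Theorem 2 rather than Theorem 1, thus showing convergence with probability 1 of the differences.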
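
The ratio appearing in Condition C for a sequence of functions can be examined numerically. Everything specific in this sketch is hypothetical and chosen only for illustration: the limit point $\mu = 2$, the scaling $\alpha_n = \sqrt{n}$, the derivatives $\xi_n'(x) = 1/x + \sin(x)/n$ with limit $\eta(x) = 1/x$, and the helper name `ratio_gap`. The supremum over $[a,b]$ is approximated on a grid.

```python
import math

MU = 2.0  # illustrative limit point mu (an assumption)

def eta(x):
    return 1.0 / x                       # pointwise limit of the derivatives xi_n'

def xi_prime(n, x):
    return 1.0 / x + math.sin(x) / n     # hypothetical xi_n'(x), tending to eta(x)

def ratio_gap(n, a=-3.0, b=3.0, points=201):
    """Grid value of sup |xi_n'(x/phi_n(mu) + mu) / xi_n'(mu) - 1| with alpha_n = sqrt(n)."""
    scale = math.sqrt(n) * xi_prime(n, MU) / eta(MU)     # phi_n(mu)
    grid = (a + (b - a) * i / (points - 1) for i in range(points))
    return max(abs(xi_prime(n, x / scale + MU) / xi_prime(n, MU) - 1.0) for x in grid)

gaps = [ratio_gap(n) for n in (100, 10_000, 1_000_000)]
print(gaps)
```

The shrinking gap reflects the two ingredients of the proof: the argument $x/\varphi_n(\mu) + \mu$ approaches $\mu$, and $\xi_n'(\mu)$ approaches $\eta(\mu)$.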


5. Applications to Poisson Distributions. In this section we consider the problem of normalizing the Poisson distribution, a problem in which there has been some recent interest [6]. Let $X_n$ have a Poisson distribution with mean $n$. Then, of course, $\mathcal{L}\!\left(\frac{X_n - n}{\sqrt{n}}\right) \to \mathcal{L}(Y)$ where $Y$ has the standard normal distribution. Further $\frac{X_n}{n}$ converges in probability to the constant 1. A standard theorem [2, §20.6] states that if $\mathcal{L}(Y_n) \to \mathcal{L}(Y)$ and $Z_n$ converges in probability to the non-zero constant $c$, then $\mathcal{L}(Y_n/Z_n) \to \mathcal{L}(Y/c)$. These three facts can be combined to give the asymptotic normality of a great many r.v.s, e.g.,


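$\left(\sqrt{X_n} - \sqrt{n}\right)$,  $\left(\sqrt{X_n + c} - \sqrt{n + c}\right)$,  $\left(\sqrt{X_n + c_n} - \sqrt{n + c_n}\right)$,  and  $\left(R\!\left(\frac{X_n}{n}\right) - R(1)\right)\sqrt{n}$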


where $R(x)$ is any rational function. The first three are also immediate consequences of Theorem 1. Theorem 1 gives other examples. We give an intuitively appealing form of Curtiss's theorem for r.v.s with Poisson distributions.

Theorem 6. If $X_n$ has a Poisson distribution with mean $n$ and if $\{\varphi_n(x)\}$ is a sequence of continuous real-valued functions defined for real $x$, such that

(22)  $\lim_{n \to \infty} \varphi_n(x\sqrt{n} + n)\,\sqrt{n} = 1$

uniformly on an arbitrary closed and bounded interval $[a,b]$, then the sequence of laws of

$\int_{n}^{X_n} \varphi_n(x)\,dx$

converges to that of a standard normal r.v. $Y$.

(Roughly this theorem says that a function which is nearly constant within many standard deviations of the mean will serve as integrand for a normalizing transformation.)
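
The conclusion of Theorem 6 can be compared with exact Poisson probabilities in one case. The sketch below takes $\varphi_n(x) = x^{-1/2}$, which gives $T_n = 2(\sqrt{X_n} - \sqrt{n})$; this integrand, the grid of $\omega$ values, and the function names are illustrative assumptions. The d.f. of $T_n$ is computed exactly from the Poisson probabilities and compared with the standard normal d.f.

```python
import math

def poisson_cdf(k, mean):
    """P(X <= k) for X ~ Poisson(mean), summing the probabilities in log space."""
    if k < 0:
        return 0.0
    log_mean = math.log(mean)
    return sum(math.exp(j * log_mean - mean - math.lgamma(j + 1)) for j in range(k + 1))

def normal_cdf(w):
    return 0.5 * (1.0 + math.erf(w / math.sqrt(2.0)))

def max_error(n, grid=(-2.0, -1.0, 0.0, 1.0, 2.0)):
    """Largest |P(T_n <= w) - Phi(w)| on the grid, where T_n = 2*(sqrt(X_n) - sqrt(n))."""
    worst = 0.0
    for w in grid:
        # T_n <= w  <=>  X_n <= (sqrt(n) + w/2)**2, valid while sqrt(n) + w/2 >= 0.
        k = math.floor((math.sqrt(n) + w / 2.0) ** 2)
        worst = max(worst, abs(poisson_cdf(k, n) - normal_cdf(w)))
    return worst

e100, e10k = max_error(100), max_error(10_000)
print(e100, e10k)
```

The error on this grid is smaller for the larger mean, in line with the theorem; the grid is coarse and the comparison says nothing about rates.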

To prove the theorem, we show that (22) implies conditions A, B, and C. Taking $x = 0$ in (22) we have

(23)  $\lim_{n \to \infty} \varphi_n(n)\,\sqrt{n} = 1$

so that conditions A and B are satisfied. Thus we need only show that (22) implies condition C. But, in virtue of (23), this reduces to showing that

(24)  $\lim_{n \to \infty} \varphi_n\!\left(\frac{x}{\varphi_n(n)} + n\right)\sqrt{n} = 1$

uniformly for $x \in [a,b]$.

Establishing (24) is an easy application of the "Moore-Osgood" iterated limits theorem with a parameter [4, Theorem VII.4, p. 102]. We consider an arbitrary finite closed interval $[a,b]$ and the double sequence of functions

(25)  $\psi_{m,n}(x) = \varphi_n\!\left(\frac{x}{\varphi_m(m)\sqrt{m}}\,\sqrt{n} + n\right)\sqrt{n}$ .

By (23), for $x \in [a,b]$ and any positive $\varepsilon$,

$\frac{x}{\varphi_m(m)\sqrt{m}} \in [a - \varepsilon,\, b + \varepsilon]$

for $m$ sufficiently large, say, $m > m_0$. Consequently, by hypothesis,

(26)  $\lim_{n \to \infty} \psi_{m,n}(x) = 1$

uniformly for $x \in [a,b]$ and $m > m_0$. Since $\varphi_n$ is continuous and hence uniformly continuous on finite closed intervals,

(27)  $\lim_{m \to \infty} \psi_{m,n}(x) = \varphi_n(x\sqrt{n} + n)\,\sqrt{n}$

uniformly for $x \in [a,b]$ for each $n$. But then by the Moore-Osgood theorem the double limit

$\lim_{m \to \infty,\ n \to \infty} \psi_{m,n}(x)$

exists uniformly for $x \in [a,b]$ and is equal to the iterated limit

$\lim_{m \to \infty} \lim_{n \to \infty} \psi_{m,n}(x)$ .

But this iterated limit is 1, according to (26). In particular, then, we can assert that the limit of the "diagonal" terms,

(28)  $\lim_{n \to \infty} \psi_{n,n}(x) = 1$

uniformly for $x \in [a,b]$. Now (28) is the same as (24), so the proof is complete.
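
The role of the diagonal terms can be illustrated with a concrete double sequence. Both ingredients below are assumptions: the integrand $\varphi_n(x) = (1 + 1/n)\,x^{-1/2}$ is hypothetical, and the form $\psi_{m,n}(x) = \varphi_n\big(x\sqrt{n}/(\varphi_m(m)\sqrt{m}) + n\big)\sqrt{n}$ is one double sequence whose diagonal reproduces the expression in (24) and whose limit in $m$ reproduces the expression in (22). Function names are hypothetical.

```python
import math

def phi(n, x):
    # Hypothetical integrand with phi_n(n)*sqrt(n) = 1 + 1/n, which tends to 1.
    return (1.0 + 1.0 / n) / math.sqrt(x)

def psi(m, n, x):
    """Assumed double sequence: phi_n(x*sqrt(n)/(phi_m(m)*sqrt(m)) + n) * sqrt(n)."""
    c_m = phi(m, m) * math.sqrt(m)           # tends to 1 as m grows
    return phi(n, x * math.sqrt(n) / c_m + n) * math.sqrt(n)

def sup_gap(m, n, a=-3.0, b=3.0, points=121):
    """Grid value of the supremum over [a, b] of |psi_{m,n}(x) - 1|."""
    grid = (a + (b - a) * i / (points - 1) for i in range(points))
    return max(abs(psi(m, n, x) - 1.0) for x in grid)

diagonal = [sup_gap(n, n) for n in (100, 10_000, 1_000_000)]
print(diagonal)
```

On the diagonal $m = n$ the inner argument collapses to $x/\varphi_n(n) + n$, so a diagonal gap tending to 0 is exactly the uniform statement needed for Condition C in this example.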

(If $R(x)$ is any rational function, not vanishing at $x = n$, then $R(x)/(R(n)\sqrt{n})$ satisfies (22). In particular if $R(x) = 1/(x + c)$ one has that $(\sqrt{n}\,\log(X_n + c) - \sqrt{n}\,\log(n + c))$ is asymptotically normal. Thus a germ of the reason that $\log X$ is normalizing for Poisson r.v.s is the statement "Out near $n$, $x/n$ is about 1.")

(There is nothing special about the role of $\sqrt{n}$ or of the Poisson distribution in the above proof. One can prove a general theorem by exactly the same methods.)
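
The claim about rational functions can be checked numerically for particular choices. In the sketch below the two rational functions, the constant $c = 0.5$, the interval $[-3, 3]$, and the helper name `gap_22` are illustrative assumptions; the supremum is approximated on a grid. With $\varphi_n(x) = R(x)/(R(n)\sqrt{n})$ the quantity $\varphi_n(x\sqrt{n} + n)\sqrt{n}$ simplifies to $R(x\sqrt{n} + n)/R(n)$.

```python
import math

def gap_22(R, n, a=-3.0, b=3.0, points=121):
    """Grid value of sup |phi_n(x*sqrt(n) + n)*sqrt(n) - 1| for phi_n = R/(R(n)*sqrt(n))."""
    root = math.sqrt(n)
    grid = (a + (b - a) * i / (points - 1) for i in range(points))
    # phi_n(x*sqrt(n) + n) * sqrt(n) simplifies to R(x*sqrt(n) + n) / R(n).
    return max(abs(R(x * root + n) / R(n) - 1.0) for x in grid)

def reciprocal(x, c=0.5):
    return 1.0 / (x + c)                  # R(x) = 1/(x + c), whose integral is log(x + c)

def quotient(x):
    return (x * x + 1.0) / (x + 3.0)      # another illustrative rational function

for R in (reciprocal, quotient):
    print(R.__name__, [gap_22(R, n) for n in (100, 10_000, 1_000_000)])
```

Both gaps shrink as $n$ grows because a rational function changes by a relative amount of order $1/\sqrt{n}$ across a window of a few standard deviations around $n$.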

6. Final Remarks. The theorems here do not really answer the important questions of data transformation. In fact, they seem to raise more than they solve, for they indicate a tremendous latitude in the choice of the function $\varphi_n(x)$. Thus, if $\varphi_n(x)$ is any function satisfying conditions A, B and C then so does, for instance, $\varphi_n(x)R(x)/R(\mu_n)$ where $R(x)$ is any rational function which does not vanish at $x = \mu$. But $\varphi_n(x) = \varphi_n(\mu_n)$ being constant satisfies C, so that many transformations will work.

This latitude in the choice of $\varphi_n(x)$, the fact that Theorem 2 says that, asymptotically, one is not changing $Y_n$, and general remarks about applying asymptotic theorems show that any such transformation should carry with it analysis of the closeness of approximation. The large body of work done on this subject is particularly reassuring.